A multimodal learning interface for word acquisition

Authors

  • Dana H. Ballard
  • Chen Yu
Abstract

We present a multimodal interface that learns words from natural interactions with users. The system can be trained in an unsupervised mode in which users perform everyday tasks while describing their behavior in natural language. We collect acoustic signals together with user-centric multisensory information from non-speech modalities, such as video from the user’s perspective, gaze positions, head directions, and hand movements. We develop a multimodal learning algorithm that first spots words in continuous speech and then associates action verbs and object names with their grounded meanings. The central idea is to use non-speech contextual information to facilitate word spotting, and to use temporal correlations between data from different modalities to build hypothesized lexical items. From those items, an EM-based method selects the correct word-meaning pairs. Successful learning is demonstrated in an experiment on the natural task of “stapling papers”.
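The abstract says only that the selection step is "EM-based" and gives no further detail. As a rough illustration of how expectation-maximization can pick word-meaning pairs out of co-occurrence data, here is a minimal sketch that assumes a translation-model-style formulation (in the spirit of IBM Model 1). The function names, the meaning labels such as `STAPLER`, and the toy episodes are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def em_word_meaning(pairs, iterations=20):
    """Estimate p(word | meaning) from co-occurring words and meanings.

    pairs: list of (words, meanings) tuples, one per episode.
    Assumption: a Model-1-style mixture; the paper's actual method may differ.
    """
    words = {w for ws, _ in pairs for w in ws}
    meanings = {m for _, ms in pairs for m in ms}
    # Start from a uniform distribution over words for every meaning.
    p = {m: {w: 1.0 / len(words) for w in words} for m in meanings}
    for _ in range(iterations):
        counts = {m: defaultdict(float) for m in meanings}
        for ws, ms in pairs:
            for w in ws:
                # E-step: split each word among the meanings in its episode,
                # in proportion to the current estimates.
                total = sum(p[m][w] for m in ms)
                for m in ms:
                    counts[m][w] += p[m][w] / total
        # M-step: renormalize the expected counts for each meaning.
        for m in meanings:
            norm = sum(counts[m].values())
            if norm > 0:
                p[m] = {w: counts[m][w] / norm for w in words}
    return p


def best_word(p, meaning):
    """Return the word with the highest estimated p(word | meaning)."""
    return max(p[meaning], key=p[meaning].get)


# Hypothetical episodes: spotted words paired with the meanings in context.
episodes = [
    (["stapler", "the"], ["STAPLER"]),
    (["paper", "the"], ["PAPER"]),
    (["stapler", "paper", "the"], ["STAPLER", "PAPER"]),
]
p = em_word_meaning(episodes)
print(best_word(p, "STAPLER"), best_word(p, "PAPER"))
```

In this toy setting the function word "the" co-occurs with both meanings, so its probability mass is shared between them, while "stapler" and "paper" each concentrate on a single meaning over the iterations.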


Similar resources

Towards Understanding Child Language Acquisition: An Unsupervised Multimodal Neural Network Approach

This paper presents an unsupervised, multimodal, neural network model of early child language acquisition that takes into account the child’s communicative intentions as well as the multimodal nature of language. The model exhibits aspects of one-word child language such as generalisation to new and unforeseen utterances, a U-shaped learning trajectory and a vocabulary spurt. A probabilistic ga...


The Role of Repeated Exposure to Multimodal Input in Incidental Acquisition of Foreign Language Vocabulary

Prior research has reported incidental vocabulary acquisition with complete beginners in a foreign language (FL), within 8 exposures to auditory and written FL word forms presented with a picture depicting their meaning. However, important questions remain about whether acquisition occurs with fewer exposures to FL words in a multimodal situation and whether there is a repeated exposure effect....


A Localist Neural Network Model for Early Child Language Acquisition from Motherese

This paper presents a localist multimodal neural network that uses Hebbian learning to acquire one-word child language from child directed speech (CDS) comprising multiword utterances and queries in addition to one-word utterances. The model implements cross-situational learning between linguistic words used in child directed speech, the accompanying perceptual entities, conceptual relations an...


Prosodic Features from Large Corpora of Child-Directed Speech as Predictors of the Age of Acquisition of Words

The impressive ability of children to acquire language is a widely studied phenomenon, and the factors influencing the pace and patterns of word learning remain a subject of active research. Although many models predicting the age of acquisition of words have been proposed, little emphasis has been directed to the raw input children achieve. In this work we present a comparatively large-scale ...


A Computational Model for Taxonomy-Based Word Learning Inspired by Infant Developmental Word Acquisition

To develop human interfaces such as home information equipment, highly capable word learning ability is required. In particular, in order to realize user-customized and situation-dependent interaction using language, a function is needed that can build new categories online in response to presented objects for an advanced human interface. However, at present, there are few basic studies focusin...



Publication date: 2003